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Abstract. Continuously indexed datasets with multiple variables have 
become ubiquitous in the geophysical, ecological, environmental and cli¬ 
mate sciences, and pose substantial analysis challenges to scientists and 
statisticians. For many years, scientists developed models that aimed 
at capturing the spatial behavior for an individual process; only within 
the last few decades has it become commonplace to model multiple 
processes jointly. The key difficulty is in specifying the cross-covariance 
function, that is, the function responsible for the relationship between 
distinct variables. Indeed, these cross-covariance functions must be 
chosen to be consistent with marginal covariance functions in such a 
way that the second-order structure always yields a nonnegative def¬ 
inite covariance matrix. We review the main approaches to building 
cross-covariance models, including the linear model of coregionaliza¬ 
tion, convolution methods, the multivariate Matern and nonstation¬ 
ary and space-time extensions of these among others. We additionally 
cover specialized constructions, including those designed for asymme¬ 
try, compact support and spherical domains, with a review of physics- 
constrained models. We illustrate select models on a bivariate regional 
climate model output example for temperature and pressure, along 
with a bivariate minimum and maximum temperature observational 
dataset; we compare models by likelihood value as well as via cross- 
validation co-kriging studies. The article closes with a discussion of 
unsolved problems. 
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1. INTRODUCTION 
1.1 Motivation 

The occurrence of multivariate data indexed by 
spatial coordinates in a large number of applications 
has prompted sustained interest in statistics in re¬ 
cent years. For instance, in environmental and cli¬ 
mate sciences, monitors collect information on mul- 
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tiple variables such as temperature, pressure, wind 
speed and direction and various pollutants. Simi¬ 
larly, the output of climate models generate multi¬ 
ple variables, and there are multiple distinct climate 
models. Physical models in computer experiments 
often involve multiple processes that are indexed by 
not only space and time, but also parameter set¬ 
tings. With the increasing availability and scientific 
interest in multivariate processes, statistical science 
faces new challenges and an expanding horizon of 
opportunities for future exploration. 

Geostatistical applications mainly focus on inter¬ 
polation, simulation or statistical modeling. Inter¬ 
polation or smoothing in spatial statistics usually is 
synonymous with kriging, the best linear unbiased 
prediction under squared loss (Cressie, 1993). With 
multiple variables, interpolation becomes a multi¬ 
variate problem, and is traditionally accommodated 
via co-kriging, the multivariate extension of krig¬ 
ing. Co-kriging is often particularly useful when one 
variable is of primary importance, but is correlated 
with other types of processes that are more read¬ 
ily observed (Almeida and Journel (1994); Wacker- 
nagel (1994); Journel (1999); Shmaryan and Journel 
(1999); Subramanyam and Pandalai (2008)). Much 
expository work has been developed on co-kriging, 
see Myers (1982, 1983, 1991, 1992), Long and Myers 
(1997), Furrer and Genton (2011) and Sang, Jun and 
Huang (2011) for discussion and technical details. 

Consider a p-dimensional multivariate random 
field Z(s) = (Zi(s),. .., Z p (s)} T defined on M d , 
d > 1, where Zi(s) is the zth process at location 
s, for i = l,...,p. If Z(s) is assumed to be a 
Gaussian multivariate random field, then only its 
mean vector /x(s) = E{Z(s)} and cross-covariance 
matrix function C(si,S 2 ) = cov{Z(si), Z(s 2 )} = 
{(7iy(si, S 2 )}^- =1 composed of functions 

(1) Cij(si,s 2 ) = cov{Z i (si),Z,-(s 2 )}, Si, s 2 € R d , 

for i,j = 1,... ,p, need to be described to fully spec¬ 
ify the multivariate random field. Authors typically 
refer to Cij as direct- or marginal-covariance func¬ 
tions for i = j , and cross-covariance functions for i ^ 
j. Here, we assume that Z(s) is a mean zero process. 
The quantities Pij(s l ,s 2 ) = s 2 )/{Cu(si, Si) • 

Cjj{ s 2 ,s 2 )} 1/2 are the cross-correlation functions. 
Our goal is then to construct valid and flexible 
cross-covariance functions (1), that is, the matrix¬ 
valued mapping M pxp , where M pxp 

is the set of p x p real-valued matrices, must 
be nonnegative definite in the following sense. 


The covariance matrix H of the random vector 

C(si,s n )\ 
C(s 2 ,s n ) 

• 5 

C(s n , s n )) 

should be nonnegative definite: a T Sa > 0 for any 
vector a E R np , any spatial locations si,..., s n , and 
any integer n. Fanshawe and Diggle (2012) reviewed 
approaches for the bivariate case p = 2, although 
most techniques can be readily extended to p > 2, 
and Alvarez, Rosasco and Lawrence (2012) reviewed 
approaches for machine learning. 

A multivariate random field is second-order sta¬ 
tionary (or just stationary) if the marginal and 
cross-covariance functions depend only on the sepa¬ 
ration vector h = si — s 2 , that is, there is a mapping 
Cij : R d —X R such that 

cov{Z i (s 1 ),Z j (s 2 )} = Cij{ h), h£l d . 

Otherwise, the process is nonstationary. Stationarity 
can be thought of as an invariance property under 
the translation of coordinates. A test for the station¬ 
arity of a multivariate random field can be found in 
Jun and Genton (2012). 

A multivariate random field is isotropic if it is 
stationary and invariant under rotations and reflec¬ 
tions, that is, there is a mapping C tJ : R + U {0} — » R 
such that 

cov{Zi(s\), Zj (s 2 )} = CV,(||h||), heR d , 

where || • || denotes the Euclidean norm. Other¬ 
wise, the multivariate random field is anisotropic. 
Isotropy or even stationarity are not always realistic, 
especially for large spatial regions, but sometimes 
are satisfactory working assumptions and serve as 
basic elements of more sophisticated anisotropic and 
nonstationary models. 

In the univariate setting, variograms are often 
the main focus in geostatistics, and are defined as 
the variance of contrasts. Variograms can be ex¬ 
tended to multivariate random fields in two ways: 
A covariance-based cross-variogram (Myers, 1982) 
defined as 

(3) cov{Zj(si) - Zi( s 2 ), Zj( Si) - Zj{ s 2 )}, 

Si,S 2 £K d , and a variance-based cross-variogram 
(Myers, 1991), also coined pseudo cross-variogram, 

(4) var{Zi(si)-Z,-(s 2 )}, s 1 ,s 2 eK ci . 


{Z(s 1 ) T ,...,Z(s n ) T } T GM^: 

/C(si,si) C(si,s 2 ) 
C(s 2 ,si) C(s 2 ,s 2 ) 

5 ] — 

\C(s n ,si) C(s n ,s 2 ) 
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The corresponding stationary versions are immedi¬ 
ate. Cressie and Wikle (1998) reviewed the differ¬ 
ences between (3) and (4), and argued that (4) is 
more appropriate for co-kriging because it yields the 
same optimal co-kriging predictor as the one ob¬ 
tained with the cross-covariance function C l3 in (1); 
see also Ver Hoef and Cressie (1993) and Huang, 
Yao, Cressie and Hsing (2009). Unfortunately, the 
interpretation of cross-variograms is difficult, and 
so most authors favor working with covariance and 
cross-covariance formulations. 

1.2 Properties of Cross-Covariance Matrix 
Functions 

Because the covariance matrix X in (2) must 
be symmetric, the matrix functions must satisfy 
C(si,S 2 ) = C(s 2 ,si) t , or C(h) = C(—h) T under 
stationarity. 

Therefore, cross-covariance matrix functions are not 
symmetric in general, that is, 

Qj(si,s 2 ) = cov{Zi(si),Zj(s 2 )} 

+ cov{Zj(si), Z { ( s 2 )} = Cj-j(si,s 2 ), 

si,s 2 € W l , unless the cross-covariance functions 
themselves are all symmetric (Wackernagel, 2003). 
However, the collocated matrices C(s,s), or C(0) 
under stationarity, are symmetric and nonnegative 
definite. 

The marginal and cross-covariance functions sat¬ 
isfy | Cij (si, s 2 ) | 2 < Cjj(si,Si)Cjj(s 2 ,s 2 ), or 
|CV,(h)| 2 < Ca(0)Cjj(0 ) under stationarity. How¬ 
ever, |Cjj(si,s 2 )| need not be less than or equal 
to Cij{ si,si), or |C'iy(h)| need not be less than or 
equal to CV,(0) under stationarity. This is because 
the maximum value of Cij( h) is not restricted to 
occur at h = 0, unless i = j, and in fact this some¬ 
times occurs in practice (Li and Zhang, 2011). Thus, 
there are no similar bounds between |CV,(si,s 2 )| 2 
and Cjj(si,s 2 )(7jj(si,s 2 ), or between |CV,(h)| 2 and 
Cn(h)Cjj( h) under stationarity. 

A cross-covariance matrix function is separable if 

(5) Qj(si,s 2 ) = p(s 1 ,s 2 )Rij, s 1 ,s 2 eM d , 

for all i,j = 1,... ,p, where p(si, s 2 ) is a valid, non¬ 
stationary or stationary, correlation function and 
Rij = cov(Zi,Zj) is the nonspatial covariance be¬ 
tween variables i and j. Mardia and Goodall (1993) 
introduced and used separability to model multivari¬ 
ate 

spatio-temporal data, and Bhat, Haran and Goes 


(2010) used separable covariances in the context 
of computer model calibration. In the past, sep¬ 
arable cross-covariance structures were sometimes 
called intrinsic coregionalizations (Helterbrand and 
Cressie, 1994). 

With a large number of processes, detecting struc¬ 
tures of the multivariate random process such as 
symmetry and separability can be difficult via ele¬ 
mentary data analytic techniques. Li, Genton and 
Sherman (2008) proposed an approach based on 
the asymptotic distribution of the sample cross¬ 
covariance estimator to test these various structures. 
Their methodology allows the practitioner to assess 
the underlying dependence structure of the data and 
to suggest appropriate cross-covariance functions, 
an important part of model building. 

In the special case of stationary matrix-valued co- 
variance functions, there is an intimate link between 
the cross-covariance matrix function and its spec¬ 
tral representation. In particular, define the cross- 
spectral densities fij : —>• M as 

fiM = l 7^3 [ h)dh, us € M d , 

( 27 r)“ J^d 

where t = \J— 1 is the imaginary number. A nec¬ 
essary and sufficient condition for C(-) to be a 
valid (i.e., nonnegative definite), stationary matrix¬ 
valued covariance function is for the matrix func¬ 
tion f(u?o) = {/jj(wo)}f -_i to be nonnegative def¬ 
inite for any u 0 (Cramer, 1940). While Cramer’s 
original result is stated in terms of measures of 
bounded variation, in practice using spectral den¬ 
sities is preferred. This can be viewed as a mul¬ 
tivariate extension of Bochner’s celebrated theo¬ 
rem (Bochner, 1955). The analogue of Schoen¬ 
berg’s theorem for multivariate random fields, that 
is, Bochner’s theorem for isotropic cross-covari¬ 
ance functions, has recently been investigated by 
Alonso-Malaver, Porcu and Giraldo (2013, 2015). 

1.3 Estimation of Cross-Covariances 


The empirical estimator of the cross-covariance 
matrix function of a stationary multivariate random 
field is 


( 6 ) 


C(h) 


1 

WM 


E { z ( s fc)-z} 

(k,l)GN{ h) 


•{Z(sO-Z} T , 


h € where IV(h) = {(k,l) |sfc — = h}, |lV(h)| 

denotes its cardinality, and Z = -f Ylk=i Z(s&) is the 
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sample mean vector. A valid parametric model is 
then typically fit by least squares methods to the 
empirical estimates in (6). Alternatively, one can use 
likelihood-based methods or the Bayesian paradigm 
(Brown, Le and Zidek, 1994). In any case, valid and 
flexible cross-covariance models are needed. Kiinsch, 
Papritz and Bassi (1997) studied generalized cross¬ 
covariances and their estimation. 

Papritz, Kiinsch and Webster (1993) discussed 
empirical estimators of the cross-variogram (3) and 
(4). Unlike the pseudo cross-variogram, the cross- 
variogram (3) has the disadvantage that it cannot 
be estimated when the variables are not observed 
at the same spatial locations. Lark (2003) proposed 
two outlier-robust estimators of the pseudo cross- 
variogram (4) and applied them in a multivari¬ 
ate geostatistical analysis of soil properties. Fur- 
rer (2005) studied the bias of the empirical cross¬ 
covariance matrix C(0) estimation under spatial de¬ 
pendence using both fixed-domain and increasing- 
domain asymptotics. Lim and Stein (2008) investi¬ 
gated a spectral approach based on spatial cross- 
periodograms for data on a lattice and studied their 
properties using fixed-domain asymptotics. 

2. CROSS-COVARIANCES BUILT FROM 
UNIVARIATE MODELS 

The most common approach to building cross¬ 
covariance functions is by combining univariate co- 
variance functions. The three main options in this 
vein are the linear model of coregionalization, var¬ 
ious convolution techniques and the use of latent 
dimensions. 

2.1 Linear Model of Coregionalization 

Probably the most popular approach of com¬ 
bining univariate covariances is the so-called lin¬ 
ear model of coregionalization (LMC) for station¬ 
ary random fields (Bourgault and Marcotte (1991); 
Goulard and Voltz, 1992; Grzebyk and Wackernagel 
(1994); Vargas-Guzman, Warrick and Myers (2002); 
Schmidt and Gelfand (2003); Wackernagel, 2003). 
It consists of representing the multivariate random 
held as a linear combination of r independent uni¬ 
variate random fields. The resulting cross-covariance 
functions take the form 

r 

(T) C„( h) = £ Pfc(h)AjfcA j/j, h € M. , 

k =1 

for an integer 1 < r < p, where Pk(') are valid sta¬ 
tionary correlation functions and A = {Aij)?’J =1 is 


a p x r full rank matrix. When r = 1, the cross¬ 
covariance function (7) is separable as in (5). The 
allure of this approach is that only r univariate co- 
variances /Ofc(h) must be specified, thus avoiding di¬ 
rect specification of a valid cross-covariance matrix 
function. The LMC can additionally be built from a 
conditional perspective (Royle and Berliner (1999); 
Gelfand et al. (2004)). Note that the discrete sum 
representation (7) can also be interpreted as a scale 
mixture (Porcu and Zastavnyi, 2011). 

With a large number of processes, the number of 
parameters can quickly become unwieldy and the re¬ 
sulting estimation difficult. Zhang (2007) described 
maximum likelihood estimation of the spatial LMC 
based on an EM algorithm, whereas Schmidt and 
Gelfand (2003) proposed a Bayesian coregionaliza¬ 
tion approach with application to multivariate pol¬ 
lutant data. A second drawback of the LMC is that 
the smoothness of any component of the multivari¬ 
ate random field is restricted to that of the roughest 
underlying univariate process. 

2.2 Convolution Methods 

Convolution methods fall into the two categories 
of kernel and covariance convolution. The kernel 
convolution method (Ver Hoef and Barry (1998); 
Ver Hoef, Cressie and Barry (2004)) uses 

Cij (h) 

hi (vi )kj (v 2 )/?(vi - v 2 + h) dv! dv 2 , 

si,s 2 € where the Ly are square integrable ker¬ 
nel functions and /?(•) is a valid stationary correla¬ 
tion function. This approach assumes that all the 
spatial processes Zi( s), for i = 1,... ,p, are gener¬ 
ated by the same underlying process, which is very 
restrictive in that it imposes strong dependence be¬ 
tween all constituent processes Z*(s). Overall, this 
approach and its parameters can be difficult to in¬ 
terpret and, except for some special cases, requires 
numerical integration. 

Covariance convolution for stationary spatial ran¬ 
dom fields (Gaspari and Cohn (1999); Gaspari et al. 
(2006); Majumdar and Gelfand (2007)) yields 

Cij{ h)= [ Q (h — k) Cj (k) dk, h e M. d , 

where Ci are square integrable functions. Although 
some closed-form expressions exist, this method usu¬ 
ally requires numerical integration. A particularly 


CROSS-COVARIANCE FUNCTIONS 


5 


useful example of a closed form solution is when the 
Ci are Matern correlation functions with common 
scale parameters. In this setup, Matern correlations 
are closed under convolution and this approach re¬ 
sults in a special case of the multivariate Matern 
model (Gneiting, Kleiber and Schlather, 2010). 

2.3 Latent Dimensions 


Another approach to build valid cross-covariance 
functions based on univariate (p = 1) spatial covari¬ 
ances was put forward by Apanasovich and Gen- 
ton (2010) (see also Porcu and Zastavnyi (2011)). 
Their idea was to create additional latent dimen¬ 
sions that represent the various variables to be mod¬ 
eled. Specifically, each component i of the multi¬ 
variate random held Z(s) is represented as a point 
£i = (& 1 >---,&fc) T in ^ i = l,...,p, for an in¬ 
teger 1 < k < p, yielding the marginal and cross¬ 
covariance functions 


(8) C ij (s 1 ,s 2 ) = C'{(si,^),(s 2 ,^ i )}, si,s 2 €M d , 

where C is a valid univariate covariance function on 
R d+fc ; see Gneiting, Genton and Guttorp (2007) for 
a review of possible univariate covariance functions. 
It is immediate that the resulting cross-covariance 
matrix XI in (2) is nonnegative definite because 
its entries are defined through a valid univariate 
covariance. If the covariance C is from a station¬ 
ary or isotropic univariate random held, then so is 
also the cross-covariance function (8); for instance, 

r,,(h) ro>.£,- ^). 

As an example of the aforementioned construc¬ 
tion, Apanasovich and Genton (2010) suggested 


Cij( h) 
(9) 


(TjCTj f — a||h|| 

n«.-yi + i oxp t die, - £,ii + W 2 

+ T 2 I(i = j)I{h = 0), h€M d , 


where /(•) is the indicator function, Cj > 0 are 
marginal standard deviations, r > 0 is a nugget ef¬ 
fect, and a > 0 is a length scale. Here, (5 € [0,1] con¬ 
trols the nonseparability between space and vari¬ 
ables, with f3 = 0 being the separable case. The 
parameters of the model are estimated by maxi¬ 
mum likelihood or composite likelihood methods. 
Apanasovich and Genton (2010) provided an appli¬ 
cation to a trivariate pollution dataset from Cali¬ 
fornia. Further use of latent dimensions for multi¬ 
variate spatio-temporal random helds are discussed 
in Section 7.2. The idea of latent dimensions was 
recently extended to modeling nonstationary pro¬ 
cesses by Bornn, Shaddick and Zidek (2012). 


3. MATERN CROSS-COVARIANCE 
FUNCTIONS 

The Matern class of positive definite functions has 
become the standard covariance model for univari¬ 
ate helds (Gneiting and Guttorp, 2006). The pop¬ 
ularity in large part is due to the work of Stein 
(1999) who showed that the behavior of the covari¬ 
ance function near the origin has fundamental impli¬ 
cations on predictive distributions, particularly pre¬ 
dictive uncertainty. The key feature of the Matern 
is the inclusion of a smoothness parameter that di¬ 
rectly controls correlation at small distances. The 
Matern correlation function is 

M(h|i/, a) = ■j^(a||h||)MG(a||h||), h<EM d , 

where K u is a modified Bessel function of order u, 
a > 0 is a length scale parameter that controls the 
rate of decay of correlation at larger distances, while 
v > 0 is the smoothness parameter that controls be¬ 
havior of correlation near the origin. The smooth¬ 
ness parameter is aptly named as it implies levels of 
mean square differentiability of the random process, 
with large v yielding very smooth processes that 
are many times differentiable, and small v yield¬ 
ing rough processes; in fact there is a direct con¬ 
nection between the smoothness parameter and the 
Hausdorff dimension of the resulting random process 
(Goff and Jordan, 1988). 

Due to its popularity for univariate modeling, 
there is interest in being able to simultaneously 
model multiple processes, each of which marginally 
has a Matern correlation structure. To this end, 
Gneiting, Kleiber and Schlather (2010) introduced 
the so-called multivariate Matern model, where each 
constituent process is allowed a marginal Matern 
correlation, with Materns also composing the cross¬ 
correlation structures. In particular, the multivari¬ 
ate Matern implies 

pu(h) = M(h| i/j,Oi) and 

( 10 ) 

Pij{ h) =/0jjM(h|i/jj,ay), hGl d . 

Of course, this correlation structure can be coerced 
to a covariance structure by multiplying Cu( h) by 
of and CV, (h) by OiOj. Here, fi t j is a collocated cross¬ 
correlation coefficient, and represents the strength of 
correlation between Z{ and Zj at the same location, 
h = 0. 

The difficulty in (10) is deriving conditions on 
model parameters i/j, i/y, a*, a^- and that result 
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in a valid, that is, a nonnegative definite multivari¬ 
ate covariance class. In the original work, Gneiting, 
Kleiber and Schlather (2010) described two main 
models, the parsimonious Matern and the full bi¬ 
variate Matern. The parsimonious Matern is a re¬ 
duction in complexity over (10) in that a* = a^- = a 
are held at the same value for all marginal and 
cross-covariances, and the cross-smoothnesses are 
set to the arithmetic average of the marginals, = 
(z/j + Vj)/ 2. The model is then valid with an easy-to- 
check condition on the cross-correlation coefficient 

Pij ■ 

The flexibility of the parsimonious Matern is in 
allowing each process to have a distinct marginal 
smoothness behavior, and thus allowing for simulta¬ 
neous modeling of highly smooth and rough fields. 
The natural extension to allow distinct process- 
dependent length scale parameters at turns out to be 
more involved. The full bivariate Matern of Gneit¬ 
ing, Kleiber and Schlather (2010) allows for distinct 
smoothness and scale parameters for two processes 
(and in fact results in a characterization for p = 2). 
A second set of authors, Apanasovich, Genton and 
Sun (2012), were able to overcome the deficiencies of 
the parsimonious formulation for p > 2, introducing 
the flexible Matern. The flexible Matern works for 
any number of processes, allowing for each process to 
have distinct smoothness and scale parameters, and 
is as close in spirit to allowing entirely free marginal 
Matern covariances with some level of cross-process 
dependence as is currently available. A number of 
simpler sufficient conditions are available by us¬ 
ing scale mixtures (Reisert and Burkhardt (2007); 
Gneiting, Kleiber and Schlather (2010); Schlather 
(2010); Porcu and Zastavnyi (2011)). 

It is worth pointing out that the experimental 
results of both sets of authors, Gneiting, Kleiber 
and Schlather (2010) and Apanasovich, Genton and 
Sun (2012), highlighted the importance of allowing 
for highly flexible and distinct marginal covariance 
structures, while still allowing for some degree of 
cross-process correlation, and indeed the improve¬ 
ment over an independence assumption was sub¬ 
stantial. 

4. NONSTATIONARY CROSS-COVARIANCE 
FUNCTIONS 

Geophysical, environmental and ecological spatial 
processes often exhibit spatial dependence that de¬ 
pends on fixed geographical features such as terrain 


or land use type, or dynamical environments such as 
prevailing winds. In either case, the evolving nature 
of spatial dependence is not well captured by sta¬ 
tionary models, and thus the availability of nonsta¬ 
tionary constructions is desired, that is, models such 
that the marginal and cross-covariance functions are 
now dependent on the spatial location pair, not just 
the lag vector, cov{Zj(si), Zj(s 2 )} = CV,(si, s 2 ). 

Many of the aforementioned models have been 
extended to the nonstationary setup, including the 
original stationary models as special cases. The first 
natural extension to allowing the LMC to be non¬ 
stationary is to let the latent univariate correlations 
be nonstationary, so that 
r 

Cij{ si,s 2 ) = ^2p k (si,s 2 )A ik A jk , si,s 2 €l d , 

k= 1 

where now p k are nonstationary univariate correla¬ 
tion functions. The onus of deriving a matrix-valued 
nonstationary covariance function is then alleviated 
in favor of opting for univariate nonstationary corre¬ 
lations, of which there are many choices (e.g., Samp¬ 
son and Guttorp (1992); Fuentes (2002); Paciorek 
and Schervish (2006); Bornn, Shaddick and Zidek 
(2012)). Although this extension seems straightfor¬ 
ward, we are unaware of any authors who have im¬ 
plemented such an approach. The second way to ex¬ 
tend the LMC to a nonstationary setup is to al¬ 
low the coefficients to be spatially varying (Gelfand 
et ah, 2004), so that 

r 

Cij{ si,s 2 ) = ^2 Pk ( si - s 2 )A ik (s 1 )A jk (s 2 ), 

k =1 

si,s 2 € R d . This type of approach can be useful if 
the observed multivariate process is linked in a vary¬ 
ing way to some underlying and unobserved pro¬ 
cesses. Guhaniyogi et al. (2013) combined a low 
rank predictive process approach with the nonsta¬ 
tionary LMC for computationally feasible modeling 
with large datasets. 

The multivariate Matern was extended to the 
nonstationary case by Kleiber and Nychka (2012). 
The basic idea is to allow the various Matern pa¬ 
rameters, variance, smoothness and length scale, 
to be spatially varying (Stein (2005); Paciorek 
and Schervish (2006)), using normal scale mixtures 
(Schlather, 2010). For example, temperature fields 
exhibit longer range spatial dependence over the 
ocean than over land due to terrain driven nonsta- 
tionarity, and a nonstationary Matern with spatially 
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varying length scale parameter can capture this type 
of dependence without resorting to using disjoint 
models between ocean and land. In particular, the 
nonstationary multivariate Matern supposes 

Pn(si,s 2 ) oc M(si, s 2 |za(si, s 2 ),aj(si,s 2 )), 

Pij (Sl, S 2 ) OC /% (Si, S 2 )M(S 1 , S 2 1 Vij (Si, S 2 ), dij (si, s 2 )), 

si,s 2 € R d . An additional point here is that fi l3 (s, s) 
is proportional to the collocated cross-correlation 
coefficient cor{Zj(s), Zj(s)}, that is, the strength 
of relationship between variables at the same loca¬ 
tion. This strength often varies spatially, for exam¬ 
ple minimum and maximum temperature are less 
correlated over highly mountainous regions than 
over plains where they exhibit greater dependence. 
Kleiber and Genton (2013) considered an approach 
to allowing this correlation coefficient to vary with 
location in such a way that it can be included with 
any arbitrary multivariate covariance choice, as long 
as each process has a nonzero nugget effect (which 
is not usually restrictive, as most processes exhibit 
small scale dependence that are typically modeled 
as nugget effects). Other authors have noted similar 
phenomena with other scientific data (Fuentes and 
Reich (2013); Guhaniyogi et al. (2013)). 

Owing to the increasing complexity of nonsta¬ 
tionary and multivariate models and the expertise 
required to decide on a framework as well as im¬ 
plement an estimation scheme, a few authors have 
considered nonparametric approaches to estimation. 
Extending Oehlert (1993) and Guillot, Senoussi and 
Monestiez (2001) to the multivariate case, Jun et al. 
(2011) and Kleiber, Katz and Rajagopalan (2013) 
worked with a nonparametric estimator of multivari¬ 
ate covariance that is free from model choice and is 
available throughout the observation domain. The 
underlying idea is to kernel smooth the empirical 
method-of-moments estimate of spatial covariance 
in a way that retains nonnegative definiteness and 
yields covariance estimates at any arbitrary location 
pairs, not only those with observations. Their non¬ 
parametric estimators are variations on the form 

Cij (x,y) 

n n 

£5> A (||x-s fc ||) 

k =1 1=1 

■K x (\\y- Se \\)Z l (s k )Z j (s e ) | 


k= 1 1=1 ) 

x,y£ M rf , where K\(r) = K(r/X) is a positive kernel 
function with bandwidth A. The displayed equation 

(11) is set up for the case when is mean zero 

for i = 1 for instance representing residuals 

after a mean trend has been removed; the estima¬ 
tor can also be applied to centered residuals such as 
Zi(sfc) — Z % . This type of estimator can capture sub¬ 
stantial nonstationarity that may be difficult to pick 
up parametrically (Kleiber, Katz and Rajagopalan, 
2013). The nonparametric approach to estimation is 
primarily useful when replications of the multivari¬ 
ate random field are available. Although it can be 
applied when only a single held realization is avail¬ 
able, we caution against its use given the well-known 
variability of empirical estimates in small samples. 

The two methods of covariance and kernel convo¬ 
lution can also be extended to result in nonstation¬ 
ary matrix functions (Calder, 2007, 2008; Majum- 
dar, Paul and Bautista (2010)). As with the uni¬ 
variate case, the convolution integrals are often in¬ 
tractable and must be estimated numerically, and 
parametric interpretations are sometimes ambigu¬ 
ous. 

5. CROSS-COVARIANCE FUNCTIONS WITH 
SPECIAL FEATURES 

5.1 Asymmetric Cross-Covariance Functions 

All the stationary models described so far are 
symmetric, in the sense that C t j(h) = Cji( h), or 
equivalently, CV,(h) = Cij(— h). Although Cij( h) = 
Cji(— h) by definition, the aforementioned proper¬ 
ties may not hold in general. Li, Genton and Sher¬ 
man (2008) proposed a test of symmetry of the 
cross-covariance structure of multivariate random 
fields based on the asymptotic distribution of its em¬ 
pirical estimator. If the test rejects symmetry, then 
asymmetric cross-covariance functions are needed. 

Li and Zhang (2011) proposed a general approach 
to render any stationary symmetric cross-covariance 
function asymmetric. The key idea is to notice that 
if Cij (h) is a valid symmetric cross-covariance func¬ 
tion, then 

(12) Cj“(h) = Cij(h + — a.j), heR d , 

is a valid asymmetric cross-covariance function for 
any vectors a* € M d , i = 1,... ,p, such that a.; / 


( 11 ) 
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a j. Indeed, if Z(s) = {Z\(s), ..., Z p (s)} T has cross¬ 
covariance functions C\j(h), then {Zi(s — ai), ..., 
Z p ( s — a p )} T has cross-covariance functions (7“(h) 
given by (12), i,j = 1,... ,p. In particular, the con¬ 
struction (12) can be used to produce asymmetric 
versions of the LMC and the multivariate Matern 
models. The vectors ai,..., a p introduce delays that 
generate asymmetry in the cross-covariance struc¬ 
ture. Because only the differences a* — aj matter, one 

can impose a constraint such as ai -|-+ a p = 0 or 

ai = 0 to ensure identifiability. Li and Zhang (2011) 
proposed to first estimate the marginal parame¬ 
ters of (7“ (h) in (12), and then estimate the cross¬ 
parameters and p— 1 of the a^’s. Their simulations 
and data examples showed that asymmetric cross¬ 
covariance functions, when required, can achieve re¬ 
markable improvements in prediction over symmet¬ 
ric models. Apanasovich and Genton (2010) used 
a similar strategy to produce asymmetric spatio- 
temporal cross-covariance models based on latent 
dimensions; see Section 7.2. Inducing asymmetry in 
a nonstationary model is yet an open problem. 

5.2 Compactly Supported Cross-Covariance 
Functions 

Computational issues in the face of large datasets 
is a major problem in any spatial analysis, includ¬ 
ing likelihood calculations and/or co-kriging; see the 
review by Sun, Li and Genton (2012, Section 3.7). 
Especially, if the observation network is very large 
(even on the order of thousands), likelihood calcu¬ 
lations and co-kriging equations are difficult or im¬ 
possible to solve with standard covariance models, 
due to the dense unstructured observation covari¬ 
ance matrix. One approach to overcoming this dif¬ 
ficulty is to induce sparsity in the covariance ma¬ 
trix, either by using a compactly supported covari¬ 
ance function as the model, or by covariance taper¬ 
ing, that is, multiplying a compactly supported non¬ 
negative definite function against the model covari¬ 
ance (Furrer, Genton and Nychka (2006); Kaufman, 
Schervish and Nychka (2008)). Then sparse matrix 
methods can be used to invert the covariance ma¬ 
trix, or find the determinant thereof. 

Only recently have authors begun to consider this 
problem for multivariate random fields. Most of the 
currently available models are based on scale mix¬ 
tures of the form 

Cij (h) = j (1 - \\h\\/xY + gij{x)6x, h € R d , 


or variations on this theme (Reisert and Burk- 
hardt (2007); Porcu and Zastavnyi (2011)). Here, 
v > (d+ l)/2, and {gij(x)}\ - =1 forms a valid cross¬ 
covariance matrix function. The generality of this 
construction gives rise to many interesting exam¬ 
ples. For instance, with gij(x) = x u (l — x/b )^ 3 where 
7 ij = ( 7 i + 7 j )/2 and 7 * > 0 for all i = 1 ,... ,p we 
have the multivariate Askey taper 

/ IIV.II \ "+7«+l 

Cij( h) = &" +1 B(Ty + 1, 1 / + 1) (l - 

||h|| < b, and 0 otherwise, where B is the beta 
function (Porcu et al., 2013). Kleiber and Porcu 
(2015) provided a nonstationary extension of this 
model, while Porcu et al. (2013) considered simi¬ 
lar ideas for Buhmann functions and B-splines. Da¬ 
ley, Porcu and Bevilacqua (2015) obtained multi¬ 
variate Askey functions with different compact sup¬ 
ports bij and the multivariate analogue of Wend- 
land functions. The latter provide a tool for taper¬ 
ing cross-covariance functions such as the multivari¬ 
ate Matern. Recent results on equivalence of Gaus¬ 
sian measures of multivariate random fields by Ruiz- 
Medina and Porcu (2015) will allow for assessing the 
statistical properties of multivariate tapers. Du and 
Ma (2013) derived compactly supported classes of 
the Polya type. Although there has been a flurry 
of recent activity, much additional work remains 
in implementing these models in real world appli¬ 
cations, exploring covariance tapering and under¬ 
standing limitations of stationary constructions. 

5.3 Cross-Covariance Functions on the Sphere 

Many multivariate datasets from environmental 
and climate sciences are collected over large portions 
of the Earth, for example, by satellites and, there¬ 
fore, cross-covariance functions on the sphere § 2 in 
M 3 are in need. Consider a multivariate process on 
the sphere for which the zth variable is described 
by Zi(L,l), i = 1 ,...,p, with L denoting latitude 
and l denoting longitude. Jun (2011) constructed 
cross-covariance functions by applying differential 
operators with respect to latitude and longitude to 
the process on the sphere. Furthermore, Jun (2011) 
studied nonstationary models of cross-covariances 
with respect to latitude, so-called axially symmet¬ 
ric, and longitudinally irreversible cross-covariance 
functions for which 

cov{Zj(Li, li), Zj(L,2,12)} 

^cov{Z i (Li,Z 2 ),Z 7 -(L 2 ,ii)}, 
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€ § 2 , (L 2 ,h) £ § 2 - All the models described 
in Jun (2011) are valid for the chordal distance, 
that is, the Euclidean distance in M 3 between points 
on § 2 . Castruccio and Genton (2014) relaxed the 
assumption of axial symmetry for univariate ran¬ 
dom fields on the sphere and the extension of their 
work to multivariate random fields on the sphere 
remains an open problem. Gneiting (2013) pro¬ 
vided a very thorough study of positive definite 
functions on a sphere that can be used as covari¬ 
ances. Du, Ma and Li (2013) developed a char¬ 
acterization of isotropic and continuous variogram 
matrix functions on the sphere, extending some of 
the ideas of Ma (2012) who characterized contin¬ 
uous and isotropic covariance matrix functions on 
the sphere using Gegenbauer polynomials. Because 
the great circles are the geodesics on the sphere, 
they are the natural metric to measure distances in 
this context. Porcu, Bevilacqua and Genton (2014) 
developed cross-covariance functions of the great 
circle distances on the sphere. In particular, they 
studied multivariate Matern models as functions of 
the great circle distance on the sphere. Recently, 
Jun (2014) developed nonstationary Matern cross¬ 
covariance models whose smoothness parameters 
vary over space and with large-scale nonstationarity 
obtained with the aforementioned differential oper¬ 
ators. 

6. DATA EXAMPLES 

We illustrate a selection of the above cross-covari¬ 
ance models on two data examples. First, a set of re¬ 
analysis climate model output that represents spa¬ 
tially gridded data. Second, a set of observational 


temperature data that illustrates spatially irregu¬ 
larly located data. 

6.1 Climate Model Output Data 

The specific reanalysis dataset in use is a Na¬ 
tional Centers for Environmental Protection-driven 
(NCEP) run of the updated Experimental Climate 
Prediction Center (ECP2) model, which was origi¬ 
nally run as part of the North American Regional 
Climate Change Assessment Program (NARCCAP) 
climate modeling experiment (Mearns et al., 2009). 
Reanalysis data can be thought of as an estimate 
of the true state of the atmosphere for a given pe¬ 
riod. The variables we use are average summer tem¬ 
perature and cube-root precipitation (summer be¬ 
ing comprised of June, July and August; JJA) over 
a region of the midwest United States that is largely 
an agricultural region with relatively constant ter¬ 
rain. The cube-root transformation reduces skew¬ 
ness in the precipitation output and brings the dis¬ 
tribution closer to Gaussian. For each grid cell, we 
calculate a pointwise spatially varying mean as the 
arithmetic average of all 24 years of model output 
from 1981 through 2004. The data considered then 
are 24 years of residuals, having removed this spa¬ 
tially varying mean from each year’s reanalysis out¬ 
put for the two variables of temperature and cube- 
root precipitation. The residuals are assumed to be 
independent between years, and are additionally as¬ 
sumed to be realizations from a mean zero bivariate 
Gaussian process (both assumptions are supported 
by exploratory analysis). 

Figure 1 contains an example set of reanaly¬ 
sis residuals for the year 1989. By eye, it ap¬ 
pears that temperature residuals are smoother over 


0.2 

0.0 

* 0.2 

*0.4 


Fig. 1. Example residuals from 1989 after removing a spatially varying mean from NCEP-driven ECP2 regional climate 
model run for the variables of average summer temperature and precipitation. Units are degrees Celsius for temperature and 
centimeters for precipitation. 
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space, while precipitation is apparently rougher, 
while both seem to have similar correlation length 
scales. The two variables are strongly negatively 
correlated, with an empirical correlation coefficient 
of —0.67. This situation, with negative and strong 
cross-correlation and both variables exhibiting dis¬ 
tinct levels of smoothness, provides numerous chal¬ 
lenges to available cross-correlation models. Call 
T(s,t) and P(s,t) the temperature and precipita¬ 
tion residual at location s in year t, respectively (re¬ 
calling that, although indexed by year, the processes 
are viewed as temporally-independent). 

Of the above models, we compare six to an in¬ 
dependence assumption, that is, where temperature 
and precipitation residuals are assumed to be inde¬ 
pendent; for the independence model, each variable 
is assumed to follow a Matern covariance, and pa¬ 
rameters are estimated by maximum likelihood. The 
first nontrivial bivariate model is the parsimonious 
Matern, whose parameters we estimate by maxi¬ 
mum likelihood. The second model is a nearly full 
bivariate Matern, where we set the cross-covariance 
smoothness upp, T representing temperature and 
P precipitation, to be the arithmetic average of 
the marginal smoothnesses. For the full bivariate 
Matern, we set marginal parameters to be those of 
the independence model, and conditional on these, 
estimate the remaining cross-covariance length scale 
a tp and cross-correlation coefficient ptp by maxi¬ 
mum likelihood. We additionally consider two varia¬ 
tions on the bivariate parsimonious Matern, one us¬ 
ing a lagged covariance of Li and Zhang (2011) (see 
Section 5.1), and a nonstationary Matern with spa¬ 
tially varying variances for both variables. Spatially 
varying variances are estimated empirically at each 
grid cell, and conditional on these, the remaining pa¬ 
rameters are estimated by maximum likelihood. We 
also consider a linear model of coregionalization, 

T(s,f) = auZi(s,t), 

P(s,t) = a 12 Z 1 (s,t) + a 22 Z 2 (s,f), 


where Z\ and Z 2 are independent mean zero spa¬ 
tial processes with Matern covariances. We opt for 
this formulation since temperature is expected to 
be smoother than precipitation, and our goal is to 
preserve this feature within the statistical model. 
Parameters are estimated by maximum likelihood. 
Finally, we additionally consider two latent dimen¬ 
sional models. The first is parameterized by (9), ex¬ 
cept without a nugget effect, and the second is built 
via 

T(s,t) = &nZ(s,t) + 612^1 (s,i), 

P(s,t) = b 2 iZ(s,t) + b 2 2 Z 2 (s,t), 

where Z(s,t) has a latent dimensional covariance of 
the form 

C{h) ” (IK, -<*11 +O' 5 exp { (lift - GI +1)4’ 

h 6 R 2 , and Z \, Z 2 are independent with Matern 
correlations. This choice for Z allows the temper¬ 
ature process to retain smoother behavior at the 
origin than precipitation, whereas the model of (9) 
forces exponential-like behavior at the origin. 

Table 1 contains the parsimonious and full bivari¬ 
ate Matern parameter estimates. Note the smooth¬ 
ness parameter of the temperature field is approx¬ 
imately 1.3, indicating a relatively smooth field, 
which supports the theoretical analysis of North, 
Wang and Genton (2011); on the other hand, pre¬ 
cipitation has a smoothness of approximately 0.55, 
suggesting an exponential model may work well. 
Both variables have similar length scale parameters, 
which suggests the assumptions of the parsimonious 
Matern model may be reasonable for this particu¬ 
lar dataset. The cross-correlation coefficient is es¬ 
timated to be strongly negative in both cases, with 
the full Matern slightly closer to the empirical cross¬ 
correlation. 

Table 2 contains log likelihood values for the vari¬ 
ous models considered. Evidently, the parsimonious, 
full and parsimonious lagged Matern all have likeli¬ 
hood values on the same order, which are all superior 


Table 1 

Maximum likelihood estimates of parameters for full and parsimonious bivariate Matern models, applied to the NARCCAP 
model data. Units are degrees Celsius for temperature, centimeters for precipitation, and kilometers for distances 


Model 

(T t 

(Tp 

1ST 

Up 

1/ttT 

1/ap 

1/cltp 

Ptp 

Full 

1.63 

0.19 

1.31 

0.55 

384.3 

361.6 

420.1 

-0.60 

Parsimonious 

1.61 

0.19 

1.33 

0.54 

367.1 

- 

- 

-0.49 
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Table 2 

Comparison of log likelihood values and pseudo cross-validation scores averaged over ten cross-validation replications for 
various multivariate models on the NARCCAP model data residuals for temperature (T) and precipitation (P) 



Log likelihood 

RMSE (T) 

CRPS (T) 

RMSE (P) 

CRPS(P) 

Nonstationary parsimonious Matern 

53564.5 

0.168 

0.084 

0.085 

0.047 

Parsimonious lagged Matern 

52563.7 

0.179 

0.090 

0.087 

0.048 

Full Matern 

52560.1 

0.178 

0.090 

0.087 

0.048 

Parsimonious Matern 

52556.9 

0.179 

0.090 

0.087 

0.048 

Latent dimension 

52028.8 

0.180 

0.091 

0.088 

0.049 

LMC 

51937.0 

0.179 

0.091 

0.090 

0.050 

Independent Matern 

50354.5 

0.180 

0.091 

0.088 

0.049 

Latent dimension of (9) 

48086.3 

0.195 

0.100 

0.088 

0.048 


to the LMC, independent Matern and latent dimen¬ 
sional models. We remark that, given the smooth na¬ 
ture of the temperature field, the latent dimensional 
model of (9) is not expected to perform as well, as 
it fixes the smoothness of the temperature field at 
v = 0.5, while on the other hand the latent dimen¬ 
sional model using a shared process with squared ex¬ 
ponential covariance performs nearly as well as the 
Matern alternatives. The nonstationary extension of 
the parsimonious Matern exhibits the largest log 
likelihood, improving the next best model by over 
1000. This suggests that the bivariate field indeed 
exhibits nonstationarity, and there may be other 
modeling improvements that can be explored with 
new nonstationary cross-covariance developments. 

Finally, we perform a small pseudo cross-validation 
study. We hold out the bivariate model output at a 
randomly chosen 90% of spatial locations consistent 
over all time points. We then co-krige the remaining 
10% (62 locations) to the held out grid cells using 
parameter estimates based on the entire dataset. As 
the residual process is assumed to be independent 
between years, co-kriging is performed separately 
for each year. Root mean squared error (RMSE) 
and the continuous ranked probability score (CRPS) 
are used to validate interpolation quality, averaged 
over all held out locations and years. We repeat 
this experiment ten times for different randomly 
chosen sets of held out spatial locations and av¬ 
erage the resulting scores; the results are displayed 
in Table 2. Generally speaking, all models are ef¬ 
fectively equivalent in terms of predictive ability, 
except for the nonstationary extension to the par¬ 
simonious Matern, which appears to improve both 
predictive quantities for temperature especially. Per¬ 
haps surprisingly, the independent Matern performs 
as well for interpolation, although this has not been 


the case with all datasets (Gneiting, Kleiber and 
Schlather, 2010). 

6.2 Observational Temperature Data 

The second example we consider is a bivariate 
minimum and maximum temperature observational 
dataset. Observations are available at stations that 
are part of the United States Historical Climatology 
Network (Peterson and Vose, 1997) over the state of 
Colorado. Stations in the USHCN form the highest 
quality observational climate network in the United 
States; observations are subject to rigorous quality 
control. 

We consider bivariate daily temperature residu¬ 
als (i.e., having removed the state-wide mean) on 
September 19, 2004, a day which has good network 
coverage with observations being available at 94 sta¬ 
tions. Exploratory Q-Q plots suggest the residuals 
are well modeled marginally as Gaussian processes; 
we suppose the bivariate process is a realization from 
a bivariate Gaussian process with zero mean. 

We entertain the same set of bivariate models as 
in the previous example subsection. Due to the fact 
that the data are observational, we augment each 
process’ covariance with a nugget effect. We be¬ 
gin by estimating the independent Matern model 
separately for both minimum and maximum tem¬ 
perature residuals by maximum likelihood. Since 
the nugget effect is tied to marginal process be¬ 
havior, we fix the estimated nugget effects at their 
marginal estimates, and estimate all other covari¬ 
ance parameters from the remaining bivariate mod¬ 
els by maximum likelihood, conditional on these 
marginal nugget estimates. We remove both the bi¬ 
variate Matern and nonstationary model from con¬ 
sideration, as these are both difficult to estimate 
given a single realization of the spatial process. 
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Table 3 

Comparison of log likelihood values and pseudo cross-validation scores averaged over 100 cross-validation replications for 
various multivariate models on the USHCN observed temperature residuals for maximum temperature (max) and minimum 

temperature (min) 



Log likelihood 

RMSE (min) 

CRPS (min) 

RMSE (max) 

CRPS (max) 

Parsimonious lagged Matern 

-414.0 

3.18 

1.83 

3.14 

1.79 

Parsimonious Matern 

-414.9 

3.22 

1.85 

3.16 

1.80 

LMC 

-415.7 

3.22 

1.85 

3.16 

1.80 

Latent dimension 

-416.2 

3.23 

1.86 

3.18 

1.81 

Latent dimension of (9) 

-419.1 

3.24 

1.86 

3.17 

1.81 

Independent Matern 

-427.6 

3.41 

1.94 

3.35 

1.91 


On top of comparing in sample log likelihood 
values, we additionally consider a pseudo cross- 
validation study, leaving out a randomly selected 
25% of locations, and co-krige the remaining bivari¬ 
ate observations to these held out locations. This 
pseudo cross-validation procedure is repeated 100 
times, and Table 3 contains the averaged scores from 
this study. Contrasting with the results of the NAR- 
CCAP example, we now see the predictive bene¬ 
fit of considering multivariate second-order struc¬ 
tures. Generally, predictive RMSE and CRPS are 
improved by between 6-7% when co-kriging using 
the parsimonious lagged Matern, as compared to 
marginally kriging each variable. A potential expla¬ 
nation for the improvement here as compared to the 
NARCCAP example is that in the current study, the 
observations are subject to measurement error, and 
thus the greater uncertainty in estimating the bi¬ 
variate surface is more readily quantified using an 
appropriate bivariate covariance model. 

7. DISCUSSION 

7.1 Specialized Cross-Covariance Functions 

The models introduced so far cover the broad 
majority of usual datasets requiring multivariate 
models. However, specialized scenarios sometimes 
arise, and call for novel developments. For instance, 
some constructions involve modeling variables that 
exhibit long range dependence. Ma (2011c) exam¬ 
ined a construction for all variables having long 
or short range dependence utilizing univariate vari- 
ograms; and Ma (2011a) explored the relationship 
between multivariate covariances and variograms. 
Kleiber and Porcu (2015) derived a nonstationary 
construction that allows individual variables to be a 
spatially varying mixture of short and long range 


dependence, as well as having substantial cross¬ 
correlation between variables (with possibly oppos¬ 
ing short/long range dependence); their construc¬ 
tion is a special case of a multivariate generalization 
of the univariate Cauchy class of covariance (Gneit- 
ing and Schlather, 2004). Hristopoulos and Porcu 
(2014) defined the multivariate analogue of Spartan 
Gibbs random fields, obtained through using Hamil¬ 
tonian functionals. 

Ma (2011b) also studied various approaches to 
produce valid cross-covariance functions based on 
differentiation of univariate covariance functions 
and on scale mixtures of covariance matrix func¬ 
tions. Alternatively, Ma (201 Id) provided construc¬ 
tions of variogram matrix functions, and Du and 
Ma (2012) introduced an approach to building var¬ 
iogram matrix functions based on a univariate vari¬ 
ogram model. 

We close this section by pointing out a recent 
novel approach to generating valid matrix covari¬ 
ances by considering stochastic partial differential 
equations (SPDEs); Hu et al. (2013) used systems 
of SPDEs to simultaneously model temperature and 
humidity, yielding computationally efficient means 
to analysis by approximating a Gaussian random 
field by a Gaussian Markov random field. 

7.2 Spatio-Temporal Cross-Covariance Functions 

So far, the cross-covariance models that we de¬ 
scribed were aimed at spatial multivariate random 
fields. When adding the time dimension, the re¬ 
sulting spatio-temporal multivariate random field, 
Z(s,f), has stationary cross-covariance functions 
Cij (h, u ), where u denotes a time lag. All the previ¬ 
ous spatial cross-covariance models can be straight¬ 
forwardly extended to the spatio-temporal setting, 
for example, Rouhani and Wackernagel (1990), Choi 
et al. (2009), Berrocal, Gelfand and Holland (2010) 
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and De Iaco et al. (2013), De Iaco, Palma and Posa 
(2013) developed space-time versions of the linear 
model of coregionalization. Gelfand, Banerjee and 
Gamerman (2005) used a dynamic approach for mul¬ 
tivariate space-time data using coregionalization. 

Based on the concept of latent dimensions de¬ 
scribed in Section 2.3, Apanasovich and Genton 
(2010) have extended a class of spatio-temporal 
covariance functions for univariate random fields 
due to Gneiting (2002) to the multivariate setting. 
Specifically, if </?i(f), t > 0, is a completely monotone 
function and Vt {t ), ip?(t), t> 0, are positive func¬ 
tions with completely monotone derivatives, then 


C(h,u,v) 

(13) 


CT" 


[V’l{^ 2 M(||v|| 2 )}]' i / 2 {^2(||v|| 2 )} 1 / 2 

llhll 2 


■<P 1 


.VtI^AMIMI 2 )} 


is a valid stationary covariance function on K rf + 1 +^' 
that can be used to model cross-covariance func¬ 
tions with v = — £ •. When ip 2 (i) = 1, Gneiting’s 

class is retrieved. The case v = 0 yields a common 
spatio-temporal covariance function for each vari¬ 
able that can be made different through a LMC-type 
construction. Also judicious choices of the functions 
in (13) allow one to control nonseparability between 
space and time, between space and variables, and 
between time and variables; see Apanasovich and 
Genton (2010) for various illustrative examples. 

To further introduce asymmetry in spatio-temporal 
cross-covariance functions, Apanasovich and Genton 
(2010) have proposed two approaches based on la¬ 
tent dimensions. Using the notation of Section 2.3, 
the first type of asymmetric spatio-temporal cross¬ 
covariance is 


(14) C$(h,«) = C(h,u- Aj(£j - £,-), & - ^), 

h £ M rf , u G R, where C is a valid covariance function 
on M. d+k of a univariate random field and Ag G M fc , 
1 < k < p, controls the delay in time that creates 
asymmetry. There is no time delay if and only if 
Ag = 0 or i = j. The second type of asymmetric 
spatio-temporal cross-covariance is 


(15) C7?-(h,«) = C(h - 7 h u,u,ii - ^ - 7$«), 

h G M. d ,u G R, where the velocity vectors 7 ^ G M. d 
and 7 ^ G R fc are responsible for the lack of symme¬ 
try. When u / 0, this model is spatially anisotropic. 
Combinations of models (14) and (15) are possible. 


7.3 Physics-Constrained Cross-Covariance 
Functions 

Especially for geophysical processes, often there 
are physical constraints on a system of variables that 
must be obeyed by any stochastic model. For in¬ 
stance, Buell (1972) explored valid covariance mod¬ 
els for geostrophic wind that must satisfy physical 
relationships for isotropic geophysical flow includ¬ 
ing geopotential, longitudinal wind components and 
transverse wind components. 

In a similar vein, a number of physical processes, 
especially in fluid dynamics, involve fields with spe¬ 
cialized restrictions such as being divergence free. 
Scheuerer and Schlather (2012) developed matrix¬ 
valued covariance functions for divergence-free and 
curl-free random vector fields, which are based on 
combinations of derivatives of a specified variogram 
and extend earlier work by Narcowich and Ward 
(1994). 

Constantinescu and Anitescu (2013) introduced 
a framework for building valid matrix-valued co- 
variance functions when the constituent processes 
have known physical constraints relating their be¬ 
havior. By approximating a nonlinear physical rela¬ 
tionship between variables through series expansions 
and closures, the authors develop physically-based 
matrix covariance classes. They explored large-scale 
geostrophic wind as a case study, and illustrated 
that physically motivated cross-correlation models 
can substantially outperform independence models. 

North, Wang and Genton (2011) studied spatio- 
temporal correlations for temperature fields arising 
from simple energy-balance climate models, that 
is, white-noise-driven damped diffusion equations. 
The resulting spatial correlation on the plane is of 
Matern type with smoothness parameter u = 1, al¬ 
though rougher temperature fields are expected due 
to terrain irregularities for example. Derivations for 
temperature fields on a uniform sphere were pre¬ 
sented as well. Whether these results can be ex¬ 
tended to other variables such as pressure and wind 
fields, and possibly lead to Matern cross-covariance 
models of type (10), is an open question. 

7.4 Open Problems 

Finally, there are many open problems that call 
for more research. The most fundamental question 
is the theoretical characterization of the allowable 
classes of multivariate covariances. For instance, 
given two marginal covariances, what is the valid 
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class of possible cross-covariances that still results in 
a nonnegative definite structure? Such a character¬ 
ization is an unsolved problem. Additional to char¬ 
acterization, the companion theoretical question is 
the utility of cross-covariance models. Given the two 
data examples in this review, a natural question is: 
for the purposes of co-kriging, in what situations are 
the use of nontrivial cross-covariances beneficial? Al¬ 
though it is traditional to focus on kriging and co- 
kriging in the geostatistical literature, we wish to 
additionally emphasize the utility of these models 
for simulation of multivariate random fields. Indeed, 
without flexible cross-covariance models, it is im¬ 
possible to simulate multiple fields with nontrivial 
dependencies. 

The power exponential class of covariances is a 
useful marginal class of covariances, but to the best 
of our knowledge, a characterization of parameters 
for the validity of the multivariate version 

Aj( h ) = Aj exp | - j, heM d , 

is not known. Although we believe that the mul¬ 
tivariate Matern model (10) has more flexibility, 
this is still an interesting question, especially as this 
set of covariances requires no calculations involving 
Bessel functions. 

The extension of spatial extremes to the case of 
multiple variables has not been explored yet except 
for the recent proposal of Genton, Padoan and Sang 
(2015) who considered multivariate max-stable spa¬ 
tial processes. The aim of that research is to de¬ 
scribe the behavior of extreme events of several vari¬ 
ables across space, such as extreme rainfall and ex¬ 
treme temperature for example. This requires flexi¬ 
ble and physically-realistic cross-covariance models 
and therefore the families described herein may play 
an important role for such applications. 

Recently, there has been some new interest in 
other types of random fields than the usual Gaus¬ 
sian case. Mittag-Leffler fields contain the Gaussian 
case as a subset, but are specified in terms of an 
infinite series expansion that is unwieldy for appli¬ 
cations (Ma, 2013b). Another option is a multivari¬ 
ate extension of the Student’s t distribution, a t- 
vector distribution (Ma, 2013a); these seem to be 
more promising for applications, and some explo¬ 
ration of the utility of these types of models is called 
for. Finally, hyperbolic vector random fields contain 
the Student’s t as a limiting case, although model in¬ 
terpretation, estimation and implementation remain 
unexplored (Du et al. (2012)). 


There is also a need for valid multivariate cross¬ 
covariance functions for spatial data on a lattice. Al¬ 
though one can apply any of the models mentioned 
in this manuscript to lattice data, the extension of 
univariate Markov random field models is another 
route. For instance, Gelfand and Vounatsou (2003) 
have studied proper multivariate conditional autore¬ 
gressive models. Daniels, Zhou and Zou (2006) pro¬ 
posed a class of conditionally specified space-time 
models for multivariate processes geared to situa¬ 
tions where there is a sparse spatial coverage of one 
of the processes and a much more dense coverage of 
the other processes. This is motivated by an appli¬ 
cation to particulate matter and ozone data. Sain 
and Cressie (2007) also developed Markov random 
field models for multivariate lattice data. 

Many additional open questions remain, includ¬ 
ing theoretical development of estimation in the 
multivariate context (Pascual and Zhang, 2006). 
Vargas-Guzman, Warrick and Myers (1999) looked 
at the relationship between support size and rela¬ 
tionship between variables, but relatively few have 
explored this phenomenon in the multivariate case. 
Finally, there is a need to better understand and ex¬ 
plore the intimate connection between multivariate 
spline smoothers, co-kriging and multivariate nu¬ 
merical analysis (Beatson, zu Castell and Schrodl 
(2011); Fuselier (2008); Narcowich and Ward (1994); 
Reisert and Burkhardt, 2007). 
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